Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English French German Japanese Korean Russian Spanish
Availability:
Freely Available
License:
CC-BY-4
Size:
68000000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Felipe Soares | ParaPat | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
Size:
1 MByte
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
- Paper title: Ciron: a New Benchmark Dataset for Chinese Irony Detection
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yunfei Long | Ciron: a New Benchmark Dataset for Chinese Irony Detection | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
Chinese Portuguese
Availability:
From Owner
License:
NA
Size:
800000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Corpora for Document-Level Neural Machine Translation
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Longyue Wang | Chinese-Portuguese Document-Level Corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese Dutch French German Italian Mongolian Persian Russian Spanish Swedish Turkish
Availability:
Freely Available
License:
CC0
Size:
700 hours
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Changhan Wang | CoVoST | /N |
Documentation:
https://github.com/facebookresearch/covost
Written
Corpus,
Language Type:
Multilingual
Languages:
Arabic Chinese English
Availability:
LDC
License:
Size:
None
Production Status:
Existing-used
Use:
- Paper title: Cross-lingual Zero Pronoun Resolution
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Abdulrahman Aloraini | OntoNotes 5.0 | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Multilingual
Languages:
Chinese English French Japanese Portuguese
Availability:
Freely Available
License:
CreativeCommons
Size:
700 words
Production Status:
Newly created-finished
Use:
Lexicon Creation/Annotation
- Paper title: Linking the TUFS Basic Vocabulary to the Open Multilingual Wordnet
- Paper track: Written/oral presentation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Francis Bond | TUFS Vocabulary | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Multilingual
Languages:
Albanian Arabic Basque Bulgarian Catalan Chinese Croatian Danish Dutch English Finnish French Galician Greek Hebrew Icelandic Indonesian Italian Japanese Lithuanian Malay Norwegian Persian Polish Portuguese Romanian Slovak Slovene Spanish Swedish Thai
Availability:
Freely Available
License:
Multiple Licenses
Size:
1072646 synsets
Production Status:
Existing-used
Use:
All of the above
- Paper title: Some Issues with Building a Multilingual Wordnet
- Paper track: Infrastructural Issues/Large Projects/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | John P. McCrae | Open Multilingual WordNet | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Basque Breton Catalan Chinese English
Availability:
Freely Available
License:
CC0
Size:
2,500 hours
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: Common Voice: A Massively-Multilingual Speech Corpus
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Josh Meyer | Common Voice | /N |
Documentation:
https://voice.mozilla.org/en/datasets




